Research Paper: Representing Information in Patient Reports Using Natural Language Processing and the Extensible Markup Language

Authors

  • Carol Friedman
  • George Hripcsak
  • Lyudmila Shagina
  • Hongfang Liu
Abstract

OBJECTIVE: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model.

METHODS: A document model that encodes structured clinical information in patient reports while retaining the original contents was designed using the extensible markup language (XML), and a document type definition (DTD) was created. An existing natural language processor (NLP) was modified to generate output consistent with the model. Two hundred reports were processed using the modified NLP system, and the XML output that was generated was validated using an XML validating parser.

RESULTS: The modified NLP system successfully processed all 200 reports. The output of one report was invalid, and 199 reports were valid XML forms consistent with the DTD.

CONCLUSIONS: Natural language processing can be used to automatically create an enriched document that contains a structured component whose elements are linked to portions of the original textual report. This integrated document model provides a representation where documents containing specific information can be accurately and efficiently retrieved by querying the structured components. If manual review of the documents is desired, the salient information in the original reports can also be identified and highlighted. Using an XML model of tagging provides an additional benefit in that software tools that manipulate XML documents are readily available.
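The idea of a structured component linked to portions of the original text can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the element names (`report`, `structured`, `finding`, `sentence`), the `idref` linking attribute, and the sample sentences are hypothetical and are not the DTD or the data described in the paper. The sketch only shows how a query on the structured part might retrieve and highlight the linked source text.

```python
import xml.etree.ElementTree as ET

# Hypothetical enriched report. Element and attribute names are illustrative
# assumptions, not the DTD from the paper.
DOC = """<report>
  <structured>
    <finding name="pneumonia" certainty="high" idref="s1"/>
    <finding name="pleural effusion" certainty="no" idref="s2"/>
  </structured>
  <text>
    <sentence id="s1">There is evidence of pneumonia in the right lower lobe.</sentence>
    <sentence id="s2">No pleural effusion is seen.</sentence>
  </text>
</report>"""


def find_sentences(root, finding_name):
    """Return the original sentences linked to findings with the given name."""
    by_id = {s.get("id"): s.text for s in root.iter("sentence")}
    return [
        by_id[f.get("idref")]
        for f in root.iter("finding")
        if f.get("name") == finding_name and f.get("idref") in by_id
    ]


def highlight(root, finding_name, mark="**"):
    """Return the report text with sentences linked to the finding wrapped in `mark`."""
    linked = {
        f.get("idref") for f in root.iter("finding") if f.get("name") == finding_name
    }
    parts = []
    for s in root.iter("sentence"):
        parts.append(f"{mark}{s.text}{mark}" if s.get("id") in linked else s.text)
    return " ".join(parts)


root = ET.fromstring(DOC)
print(find_sentences(root, "pneumonia"))
print(highlight(root, "pneumonia"))
```

Note that `xml.etree.ElementTree` checks well-formedness only. Validating against a DTD, as the abstract describes, would need a validating parser, which this sketch does not include.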


Similar resources

Disseminating Natural Language Processed Clinical Narratives

Through Natural Language Processing (NLP) techniques, information can be extracted from clinical narratives for a variety of applications (e.g., patient management). While the complex and nested output of NLP systems can be expressed in standard formats, such as the eXtensible Markup Language (XML), these representations may not be directly suitable for certain end-users or applications. The av...


New Approaches in Philips ECG Database Management System Design

Electrocardiogram (ECG) databases play an important role in medical research, pharmaceutical research, medical education and health care. Due to growing demands in research, training, and health care, designing and managing such ECG databases has become a complex problem. This paper reports on the new design approach and the new application model of the Philips ECG Management System (EMS). The ...


Representing nested semantic information in a linear string of text using XML

XML has been widely adopted as an important data interchange language. The structure of XML enables sharing of data elements with variable degrees of nesting as long as the elements are grouped in a strict tree-like fashion. This requirement potentially restricts the usefulness of XML for marking up written text, which often includes features that do not properly nest within other features. We ...


The XML Framework and Its Implications for the Development of Natural Language Processing Tools

The eXtensible Markup Language (XML) (Bray, et al., 1998) is the emerging standard for data representation and exchange on the World Wide Web. The XML Framework includes very powerful mechanisms for accessing and manipulating XML documents that are likely to significantly impact the development of tools for processing natural language and annotated corpora.


New Methods and Tools for the World Wide Web Search

Explosive growth of the World Wide Web as well as its heterogeneity call for powerful and easy to use search tools capable to provide the user with a moderate number of relevant answers. This paper presents analysis of key aspects of recently developed Web search methods and tools: visual representation of subject trees, interactive user interfaces, linguistic approaches, image search, ranking ...



Journal title:
  • Journal of the American Medical Informatics Association : JAMIA

Volume 6, Issue 1

Pages: -

Publication date: 1999